Analysis of the knowledge about pain in students of Health Sciences

Objetives

  • To evaluate whether there are statistically significant differences in knowledge about pain among students of the Physiotherapy, Medicine, Nursing and Pharmacy degrees.
  • To evaluate whether there is statistically significant difference in knowledge about pain between students of the first and fourth year of the degrees.
  • To evaluate whether there is an interaction between the course and the degree in knowledge about pain.

Methodology

Cross-sectional observational study. The following questionnaires have been applied to the participating students:

  • RNPQ (Reliability and Pain Knowledge Questionnaire). Questionnaire with 12 questions with three possible answers (True, False, Don’t know). 1 point is scored for each correct answer and 0 points for each incorrect answer or “Don’t know”. The total score ranges from 0 to 12 points, with higher scores indicating greater knowledge about pain.

  • HC-PAIRS (Health care providers’ attitudes and beliefs about functional impairments and chronic back pain). Questionnaire with 15 questions with 7 possible answers on a liker scale (from Totally disagree to Totally agree). It is scored from 1 to 7 points. The total score ranges from 15 to 105, with lower scores indicating greater knowledge about pain.

# Packages loading.
library(tidyverse)
library(knitr)
library(broom)
library(vtable)
library(tidymodels)
library(ggpubr)
library(car)
library(rstatix)
library(emmeans)
library(patchwork)
# Data loading.
df <- read_csv("datos/datos-2024-11-25.csv") |>
    # Select the variables of interest.
    select(Degree, Course, Gender, RNPQ, HC_PAIRS, starts_with("X"), starts_with("Y")) |> 
    # Convert the categorical variables to factors.
    mutate(Degree = factor(Degree), Course = factor(Course), Gender = factor(Gender)) |> 
    # Create a new variable with the combination of the degree and the course.
    mutate(Degree_Course = interaction(Degree, Course))

# Show 10 random rows of the data frame.
sample_n(df, 10) |> 
    kable()
Degree Course Gender RNPQ HC_PAIRS X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 Y10 Y11 Y12 Y13 Y14 Y15 Degree_Course
nursing last f 5 53 0 0 1 1 1 0 0 1 0 0 1 0 4 5 3 3 6 2 5 2 2 6 5 2 4 5 3 nursing.last
pharmacy first f 2 57 0 0 0 1 0 0 0 0 0 1 0 0 6 4 4 1 5 2 4 5 5 7 2 4 4 1 1 pharmacy.first
nursing first m 4 74 1 0 0 1 0 0 0 0 0 0 1 1 5 6 6 6 3 3 5 7 6 6 4 5 6 5 7 nursing.first
medicine first f 5 68 0 0 0 0 1 0 0 1 0 1 1 1 2 3 4 2 4 3 4 7 5 7 3 5 6 4 5 medicine.first
physiotherapy last m 9 52 1 0 1 0 1 0 1 1 1 1 1 1 4 2 2 3 4 4 4 4 4 7 3 1 4 5 3 physiotherapy.last
medicine last f 9 50 0 0 1 1 1 1 0 1 1 1 1 1 7 3 2 2 2 6 4 2 2 6 5 1 7 6 5 medicine.last
nursing first f 5 69 0 0 0 1 1 1 0 1 0 0 1 0 4 7 7 1 5 1 4 6 7 7 5 5 7 6 1 nursing.first
physiotherapy last f 6 64 1 0 0 0 1 0 1 1 1 0 1 0 3 4 4 7 6 2 5 7 4 5 5 2 5 4 1 physiotherapy.last
nursing first f 9 73 0 0 1 1 1 1 0 1 1 1 1 1 4 6 5 6 5 4 3 4 6 6 5 4 4 6 7 nursing.first
medicine first f 8 44 0 0 1 0 1 1 1 1 0 1 1 1 5 3 1 1 6 1 7 3 1 5 4 1 7 2 1 medicine.first

Sample

Variables

# Show the structure of the data frame.
vt(df)
df
Name Class Values
Degree factor 'medicine' 'nursing' 'pharmacy' 'physiotherapy'
Course factor 'first' 'last'
Gender factor 'f' 'm'
RNPQ numeric Num: 0 to 12
HC_PAIRS numeric Num: 33 to 95
X1 numeric Num: 0 to 1
X2 numeric Num: 0 to 1
X3 numeric Num: 0 to 1
X4 numeric Num: 0 to 1
X5 numeric Num: 0 to 1
X6 numeric Num: 0 to 1
X7 numeric Num: 0 to 1
X8 numeric Num: 0 to 1
X9 numeric Num: 0 to 1
X10 numeric Num: 0 to 1
X11 numeric Num: 0 to 1
X12 numeric Num: 0 to 1
Y1 numeric Num: 1 to 7
Y2 numeric Num: 1 to 7
Y3 numeric Num: 1 to 7
Y4 numeric Num: 1 to 7
Y5 numeric Num: 1 to 7
Y6 numeric Num: 1 to 7
Y7 numeric Num: 1 to 7
Y8 numeric Num: 1 to 7
Y9 numeric Num: 1 to 7
Y10 numeric Num: 1 to 7
Y11 numeric Num: 1 to 7
Y12 numeric Num: 1 to 7
Y13 numeric Num: 1 to 7
Y14 numeric Num: 1 to 7
Y15 numeric Num: 1 to 7
Degree_Course factor 'medicine.first' 'nursing.first' 'pharmacy.first' 'physiotherapy.first' 'medicine.last' and 3 more

Sample size

The number of students who participated in the study was 716.

Sample size according to the degree.

df |> 
  count(Degree) |> 
  kable()
Degree n
medicine 156
nursing 176
pharmacy 130
physiotherapy 254

Sample size according to the course.

df |> 
  count(Course) |> 
  kable()
Course n
first 390
last 326

Sample size according to the degree and course.

table(df$Degree, df$Course) |> 
  kable()
first last
medicine 91 65
nursing 80 96
pharmacy 66 64
physiotherapy 153 101

Samples size according to the gender.

df |> 
  count(Gender) |> 
  kable()
Gender n
f 527
m 189

Descriptive analysis

Descriptive statistics

According to the degree

library(vtable)
st(df, vars = c("RNPQ", "HC_PAIRS"), group = "Degree", summ = c("mean(x)",   "sd(x)", "min(x)", "pctile(x)[25]", "median(x)", "pctile(x)[75]", "max(x)"), title = "Resumen descriptivo",  group.long = TRUE,  summ.names = c("Mean", "StdDev", "Min", "Q1", "Median", "Q3", "Max"))
Resumen descriptivo
Variable Mean StdDev Min Q1 Median Q3 Max
Degree: medicine
RNPQ 7 2 1 6 7 8 12
HC_PAIRS 63 9.3 38 57 62 69 88
Degree: nursing
RNPQ 7.1 1.8 2 6 7 8 12
HC_PAIRS 66 9.2 41 60 65 71 95
Degree: pharmacy
RNPQ 6.2 2.1 0 5 6 8 10
HC_PAIRS 67 8.1 51 62 67 73 92
Degree: physiotherapy
RNPQ 7.3 1.8 3 6 7 8 12
HC_PAIRS 59 9.6 33 53 59 66 88

According to the course

library(vtable)
st(df, vars = c("RNPQ", "HC_PAIRS"), group = "Course", summ = c("mean(x)",   "sd(x)", "min(x)", "pctile(x)[25]", "median(x)", "pctile(x)[75]", "max(x)"), title = "Resumen descriptivo",  group.long = TRUE,  summ.names = c("Mean", "StdDev", "Min", "Q1", "Median", "Q3", "Max"))
Resumen descriptivo
Variable Mean StdDev Min Q1 Median Q3 Max
Course: first
RNPQ 6.4 1.9 0 5 6 8 12
HC_PAIRS 64 8.6 39 59 64 70 95
Course: last
RNPQ 7.7 1.7 3 6 8 9 12
HC_PAIRS 62 11 33 55 61 69 92

According to the degree and course

library(vtable)
st(df, vars = c("RNPQ", "HC_PAIRS"), group = "Degree_Course", summ = c("mean(x)",   "sd(x)", "min(x)", "pctile(x)[25]", "median(x)", "pctile(x)[75]", "max(x)"), title = "Resumen descriptivo",  group.long = TRUE,  summ.names = c("Mean", "StdDev", "Min", "Q1", "Median", "Q3", "Max"))
Resumen descriptivo
Variable Mean StdDev Min Q1 Median Q3 Max
Degree_Course: medicine.first
RNPQ 6.1 1.8 1 5 6 7 12
HC_PAIRS 64 8.7 44 58 63 69 88
Degree_Course: nursing.first
RNPQ 7.1 1.9 2 6 7 8 12
HC_PAIRS 66 9.3 41 61 65 71 95
Degree_Course: pharmacy.first
RNPQ 5.2 2 0 4 5 6 10
HC_PAIRS 67 7.4 51 62 66 71 87
Degree_Course: physiotherapy.first
RNPQ 6.6 1.6 3 6 7 8 10
HC_PAIRS 63 8.4 39 57 62 68 88
Degree_Course: medicine.last
RNPQ 8.2 1.4 5 7 8 9 11
HC_PAIRS 61 9.8 38 55 60 68 83
Degree_Course: nursing.last
RNPQ 7 1.8 3 6 7 8 12
HC_PAIRS 66 9.2 46 60 65 71 90
Degree_Course: pharmacy.last
RNPQ 7.1 1.6 4 6 7 8 10
HC_PAIRS 68 8.8 51 62 67 73 92
Degree_Course: physiotherapy.last
RNPQ 8.2 1.5 5 7 8 9 12
HC_PAIRS 54 9.2 33 48 54 60 80

Boxplot

df |>
    ggplot(aes(x = Degree, y = RNPQ, fill = Degree)) +
    geom_boxplot() +
    labs(title = "Distribution of the RNPQ score by degree",
         x = "Degree",
         y = "Score") +
    theme_minimal()    

df |>
    ggplot(aes(x = Course, y = RNPQ, fill = Course)) +
    geom_boxplot() +
    labs(title = "Distribution of the RNPQ score by course",
         x = "Course",
         y = "Score") +
    theme_minimal()    

df |>
    ggplot(aes(x = Degree, y = RNPQ, fill = Degree)) +
    geom_boxplot() +
    labs(title = "Distribution of the RNPQ score by degree and course",
         x = "Degree",
         y = "Score") +
    facet_grid(. ~ Course) +
    theme_minimal()    

df |>
    ggplot(aes(x = Degree, y = HC_PAIRS, fill = Degree)) +
    geom_boxplot() +
    labs(title = "Distribution of the HC-PAIRS score by degree",
         x = "Degree",
         y = "Score") +
    theme_minimal()    

df |>
    ggplot(aes(x = Course, y = HC_PAIRS, fill = Course)) +
    geom_boxplot() +
    labs(title = "Distribution of the HC-PAIRS score by course",
         x = "Course",
         y = "Score") +
    theme_minimal()    

df |>
    ggplot(aes(x = Degree, y = HC_PAIRS, fill = Degree)) +
    geom_boxplot() +
    labs(title = "Distribution of the HC-PAIRS score by degree and course",
         x = "Degree",
         y = "Score") +
    facet_grid(. ~ Course) +
    theme_minimal()    

Barplot

df |> 
    ggplot(aes(x = RNPQ, fill = Course)) +
    geom_bar(aes(y = ..count../sum(..count..))) +
    labs(title = "Distribution of the RNPQ score by course",
         x = "RNPQ Score",
         y = "Frequency") +
    facet_grid(Course ~ .) +
    theme_minimal()

df |> 
    ggplot(aes(x = RNPQ, fill = Degree)) +
    geom_bar(aes(y = ..count../sum(..count..))) +
    labs(title = "Distribution of the RNPQ score by degree",
         x = "RNPQ Score",
         y = "Frequency") +
    facet_grid(Degree ~ .) +
    theme_minimal()

Histogram

df |> 
    ggplot(aes(x = HC_PAIRS, fill = Course)) +
    geom_histogram(aes(y = ..count../sum(..count..)), color = "white") +
    labs(title = "Distribution of the HC-PAIRS score by course",
         x = "HC-PAIRS score",
         y = "Frequency") +
    facet_grid(Course ~ .) +
    theme_minimal()

df |> 
    ggplot(aes(x = HC_PAIRS, fill = Degree)) +
    geom_histogram(aes(y = ..count../sum(..count..)), color = "white") +
    labs(title = "Distribution of the HC-PAIRS score by degree",
         x = "HC-PAIRS score",
         y = "Frequency") +
    facet_grid(Degree ~ .) +
    theme_minimal()

Means plot

ggline(df, x = "Course", y = "RNPQ", color = "Degree", add = c("mean_ci"), position = position_dodge(0.2), xlab = "Course", ylab = "RNPQ score", main = "Confidence intervals for the means of RNPQ score by degree and course", legend = "top")

Caution

It looks that the pain knowledge according to RNPQ score decreases in the last year of Nursing degree, that is very weird.

ggline(df, x = "Course", y = "HC_PAIRS", color = "Degree", add = c("mean_ci"), position = position_dodge(0.2), xlab = "Course", ylab = "RNPQ score", main = "Confidence intervals for the means of HC-PAIRS score by degree and course", legend = "top")

Caution

It looks that the pain knowledge according to HC_PAIRS decreases in the last year of Pharmacy degree, that is very weird.

Comparison of RNPQ scores

As the dependent variable si quantitative (discrete) and the dependent variables are qualitative, we perform a two-factors ANOVA test for comparing the means of RNPQ score for the groups based on degree and course.

ANOVA

anova <- aov(RNPQ ~ Degree * Course, data = df)
# Use the type III sum of squares as the groups are unbalanced.
Anova(anova, type = "III") |>
    tidy() |>
    kable()
term sumsq df statistic p.value
(Intercept) 3360.5385 1 1146.74189 0
Degree 154.8222 3 17.61038 0
Course 168.4397 1 57.47796 0
Degree:Course 129.2890 3 14.70608 0
Residuals 2074.8010 708 NA NA

The ANOVA test shows both, the degree and the course, have a statistically significant effect on the scores of the RNPQ questionnaire, and also the interaction between the degree and the course has a statistically significant effect (p-value < 0.01).

ANOVA assumptions

Now we study the residuals of the model to check the ANOVA assumptions are meet.

First we check normality of residuals.

residuals <- residuals(anova)
par(mfrow = c(1, 2))  
hist(residuals)
qqnorm(residuals, main = "Q-Q Plot of Residuals")
qqline(residuals, col = "red", lwd = 2)

According to the charts there is no significant departure from normality.

After we check homocedasticity.

leveneTest(RNPQ ~ Degree * Course, data = df) |> 
    tidy() |> 
    kable()
statistic p.value df df.residual
0.8952004 0.5096458 7 708

The variances of the scores of the RNPQ questionnaire are homogeneous in all the degree-course groups (p-value > 0.05).

Post-hoc test

As the interaction of degree and course is significant, we estimate the mean for each degree-course group.

marginal_means <- emmeans(anova, ~ Degree * Course)
marginal_means |> 
    tidy(conf.int = TRUE) |> 
    select(-statistic, -p.value) |> 
    arrange(desc(estimate)) |> 
    kable()
Degree Course estimate std.error df conf.low conf.high
physiotherapy last 8.247525 0.1703378 708 7.913097 8.581952
medicine last 8.184615 0.2123317 708 7.767740 8.601491
pharmacy last 7.140625 0.2139842 708 6.720506 7.560744
nursing first 7.137500 0.1913932 708 6.761734 7.513266
nursing last 7.041667 0.1747173 708 6.698641 7.384693
physiotherapy first 6.614379 0.1383967 708 6.342662 6.886096
medicine first 6.076923 0.1794531 708 5.724599 6.429247
pharmacy first 5.196970 0.2107170 708 4.783265 5.610675
marginal_means |> 
    as_tibble() |> 
    mutate(DegreeCourse = paste(Degree, Course, sep = "-")) |> 
    mutate(DegreeCourse = reorder(DegreeCourse, emmean)) |> 
    ggplot(aes(x = emmean, y = DegreeCourse)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    labs(title = "Estimated mean of RNPQ score",
       x = "Estimated Mean",
       y = "Degree - Course") +
    theme_minimal()

Now we compare the means of all the groups by pairs.

pairs(marginal_means) |>
    as_tibble() |> 
    arrange(p.value) |> 
    kable()
contrast estimate SE df t.ratio p.value
medicine first - medicine last -2.1076923 0.2780075 708 -7.5814223 0.0000000
medicine first - physiotherapy last -2.1706017 0.2474234 708 -8.7728218 0.0000000
nursing first - pharmacy first 1.9405303 0.2846630 708 6.8169384 0.0000000
pharmacy first - medicine last -2.9876457 0.2991428 708 -9.9873552 0.0000000
pharmacy first - nursing last -1.8446970 0.2737294 708 -6.7391251 0.0000000
pharmacy first - physiotherapy last -3.0505551 0.2709550 708 -11.2585300 0.0000000
physiotherapy first - physiotherapy last -1.6331457 0.2194735 708 -7.4411982 0.0000000
pharmacy first - pharmacy last -1.9436553 0.3003180 708 -6.4719914 0.0000000
physiotherapy first - medicine last -1.5702363 0.2534530 708 -6.1953745 0.0000000
pharmacy first - physiotherapy first -1.4174094 0.2521018 708 -5.6223689 0.0000008
nursing last - physiotherapy last -1.2058581 0.2440104 708 -4.9418299 0.0000265
nursing first - physiotherapy last -1.1100248 0.2562154 708 -4.3323892 0.0004467
medicine last - nursing last 1.1429487 0.2749744 708 4.1565643 0.0009437
pharmacy last - physiotherapy last -1.1068998 0.2735035 708 -4.0471136 0.0014796
medicine first - nursing first -1.0605769 0.2623638 708 -4.0423900 0.0015082
medicine first - nursing last -0.9647436 0.2504587 708 -3.8519072 0.0031988
medicine first - pharmacy last -1.0637019 0.2792716 708 -3.8088440 0.0037713
nursing first - medicine last -1.0471154 0.2858604 708 -3.6630312 0.0064874
medicine last - pharmacy last 1.0439904 0.3014531 708 3.4631931 0.0131278
medicine first - pharmacy first 0.8799534 0.2767762 708 3.1792956 0.0329834
medicine first - physiotherapy first -0.5374560 0.2266210 708 -2.3716076 0.2570125
nursing first - physiotherapy first 0.5231209 0.2361886 708 2.2148445 0.3439594
physiotherapy first - pharmacy last -0.5262459 0.2548389 708 -2.0650139 0.4388373
physiotherapy first - nursing last -0.4272876 0.2228897 708 -1.9170363 0.5395543
nursing first - nursing last 0.0958333 0.2591477 708 0.3698020 0.9999554
nursing last - pharmacy last -0.0989583 0.2762524 708 -0.3582172 0.9999641
medicine last - physiotherapy last -0.0629094 0.2722126 708 -0.2311038 0.9999982
nursing first - pharmacy last -0.0031250 0.2870899 708 -0.0108851 1.0000000

As many of this comparisons make no sense, we perform a pairwise comparison of degrees by course.

emmeans(anova, ~ Degree | Course) |> 
    pairs() |> 
    as_tibble() |> 
    arrange(Course, p.value) |> 
    kable()
contrast Course estimate SE df t.ratio p.value
nursing - pharmacy first 1.9405303 0.2846630 708 6.8169384 0.0000000
pharmacy - physiotherapy first -1.4174094 0.2521018 708 -5.6223689 0.0000002
medicine - nursing first -1.0605769 0.2623638 708 -4.0423900 0.0003415
medicine - pharmacy first 0.8799534 0.2767762 708 3.1792956 0.0083666
medicine - physiotherapy first -0.5374560 0.2266210 708 -2.3716076 0.0835390
nursing - physiotherapy first 0.5231209 0.2361886 708 2.2148445 0.1201684
nursing - physiotherapy last -1.2058581 0.2440104 708 -4.9418299 0.0000058
medicine - nursing last 1.1429487 0.2749744 708 4.1565643 0.0002120
pharmacy - physiotherapy last -1.1068998 0.2735035 708 -4.0471136 0.0003349
medicine - pharmacy last 1.0439904 0.3014531 708 3.4631931 0.0031685
nursing - pharmacy last -0.0989583 0.2762524 708 -0.3582172 0.9842483
medicine - physiotherapy last -0.0629094 0.2722126 708 -0.2311038 0.9956502
emmeans(anova, ~ Degree | Course) |> 
    pairs() |> 
    confint() |> 
    as_tibble() |> 
    filter(Course == "first") |> 
    mutate(contrast = reorder(contrast, estimate)) |> 
    ggplot(aes(x = estimate, y = contrast)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference of RNPQ means in the first course",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

In the first course there are significant differences between nursing and pharmacy, pharmacy and physioterapy, medicine and nursing and medicine and pharmacy.

emmeans(anova, ~ Degree | Course) |> 
    pairs() |> 
    confint() |> 
    as_tibble() |> 
    filter(Course == "last") |> 
    mutate(contrast = reorder(contrast, estimate)) |> 
    ggplot(aes(x = estimate, y = contrast)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference of RNPQ means in the last course",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

In the last course, there are significant differences between nursing and physiotherapy, medicine and nursing, pharmacy and physiotherapy and medicine and nursing.

In the same way, we compare the courses for each degree.

emmeans(anova, ~ Course | Degree) |> 
    pairs() |> 
    as_tibble() |> 
    arrange(p.value) |> 
    kable()
contrast Degree estimate SE df t.ratio p.value
first - last medicine -2.1076923 0.2780075 708 -7.581422 0.0000000
first - last physiotherapy -1.6331457 0.2194735 708 -7.441198 0.0000000
first - last pharmacy -1.9436553 0.3003180 708 -6.471991 0.0000000
first - last nursing 0.0958333 0.2591477 708 0.369802 0.7116406
emmeans(anova, ~ Course | Degree) |> 
    pairs() |> 
    confint() |> 
    as_tibble() |> 
    mutate(Degree = reorder(Degree, estimate)) |> 
    ggplot(aes(x = estimate, y = Degree)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference between the first and last courses means by degrees",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

There are significant differences between the first and the last courses in medicine, pharmacy and physiotherapy, but not in nursing.

Comparison of HC-PAIRS scores

ANOVA

anova <- aov(HC_PAIRS ~ Degree * Course, data = df)

anova |>
    tidy() |>
    kable()
term df sumsq meansq statistic p.value
Degree 3 7264.443 2421.48106 30.90401 0e+00
Course 1 1952.302 1952.30244 24.91615 8e-07
Degree:Course 3 2812.416 937.47194 11.96443 1e-07
Residuals 708 55475.279 78.35491 NA NA

The ANOVA test shows both, the degree and the course, have a statistically significant effect on the scores of the HC-PAIRS questionnaire, and also the interaction between the degree and the course has a statistically significant effect (p-value < 0.01).

ANOVA assumptions

Now we study the residuals of the model to check the ANOVA assumptions are meet.

First we check normality of residuals.

residuals <- residuals(anova)
par(mfrow = c(1, 2))  
hist(residuals)
qqnorm(residuals, main = "Q-Q Plot of Residuals")
qqline(residuals, col = "red", lwd = 2)

According to the charts there is no significant departure from normality.

After we check homocedasticity.

leveneTest(HC_PAIRS ~ Degree * Course, data = df) |> 
    tidy() |> 
    kable()
statistic p.value df df.residual
0.7485648 0.6308148 7 708

The variances of the scores of the HC_PAIRS questionnaire are homogeneous in all the degree-course groups (p-value > 0.05).

Post-hoc test

As the interaction of degree and course is significant, we estimate the mean for each degree-course group.

marginal_means <- emmeans(anova, ~ Degree * Course)
marginal_means |> 
    tidy(conf.int = TRUE) |> 
    select(-statistic, -p.value) |> 
    arrange(desc(estimate)) |> 
    kable()
Degree Course estimate std.error df conf.low conf.high
pharmacy last 68.20312 1.1064789 708 66.03075 70.37550
pharmacy first 66.63636 1.0895851 708 64.49716 68.77557
nursing first 66.00000 0.9896648 708 64.05697 67.94303
nursing last 65.93750 0.9034362 708 64.16377 67.71123
medicine first 64.01099 0.9279240 708 62.18918 65.83280
physiotherapy first 62.79085 0.7156281 708 61.38584 64.19586
medicine last 60.81538 1.0979345 708 58.65979 62.97098
physiotherapy last 54.38614 0.8807901 708 52.65687 56.11541
marginal_means |> 
    as_tibble() |> 
    mutate(DegreeCourse = paste(Degree, Course, sep = "-")) |> 
    mutate(DegreeCourse = reorder(DegreeCourse, emmean)) |> 
    ggplot(aes(x = emmean, y = DegreeCourse)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    labs(title = "Estimated means of HC_PAIRS score",
       x = "Estimated Mean",
       y = "Degree - Course") +
    theme_minimal()

Now we compare the means of all the groups by pairs.

pairs(marginal_means) |>
    as_tibble() |> 
    arrange(p.value) |> 
    kable()
contrast estimate SE df t.ratio p.value
medicine first - physiotherapy last 9.6248504 1.279388 708 7.5230100 0.0000000
nursing first - physiotherapy last 11.6138614 1.324850 708 8.7661705 0.0000000
pharmacy first - physiotherapy last 12.2502250 1.401066 708 8.7435011 0.0000000
physiotherapy first - physiotherapy last 8.4047111 1.134863 708 7.4059234 0.0000000
nursing last - physiotherapy last 11.5513614 1.261740 708 9.1551032 0.0000000
pharmacy last - physiotherapy last 13.8169864 1.414244 708 9.7698729 0.0000000
medicine last - pharmacy last -7.3877404 1.558767 708 -4.7394759 0.0000704
medicine last - physiotherapy last 6.4292460 1.407569 708 4.5676230 0.0001566
physiotherapy first - pharmacy last -5.4122753 1.317733 708 -4.1072638 0.0011574
pharmacy first - medicine last 5.8209790 1.546821 708 3.7631880 0.0044806
medicine last - nursing last -5.1221154 1.421850 708 -3.6024304 0.0080719
nursing first - medicine last 5.1846154 1.478140 708 3.5075276 0.0112716
pharmacy first - physiotherapy first 3.8455140 1.303579 708 2.9499653 0.0645749
medicine first - pharmacy last -4.1921360 1.444070 708 -2.9030003 0.0734847
physiotherapy first - nursing last -3.1466503 1.152528 708 -2.7302161 0.1152759
nursing first - physiotherapy first 3.2091503 1.221294 708 2.6276632 0.1477160
medicine first - medicine last 3.1956044 1.437534 708 2.2229770 0.3391079
medicine first - pharmacy first -2.6253746 1.431167 708 -1.8344294 0.5966723
nursing last - pharmacy last -2.2656250 1.428458 708 -1.5860633 0.7587116
physiotherapy first - medicine last 1.9754651 1.310566 708 1.5073371 0.8035418
medicine first - nursing last -1.9265110 1.295083 708 -1.4875579 0.8141205
nursing first - pharmacy last -2.2031250 1.484497 708 -1.4840884 0.8159461
medicine first - nursing first -1.9890110 1.356643 708 -1.4661274 0.8252497
medicine first - physiotherapy first 1.2201393 1.171822 708 1.0412327 0.9679537
pharmacy first - pharmacy last -1.5667614 1.552898 708 -1.0089276 0.9731072
pharmacy first - nursing last 0.6988636 1.415412 708 0.4937526 0.9996880
nursing first - pharmacy first -0.6363636 1.471948 708 -0.4323274 0.9998717
nursing first - nursing last 0.0625000 1.340013 708 0.0466414 1.0000000

As many of this comparisons make no sense, we perform a pairwise comparison of degrees by course.

emmeans(anova, ~ Degree | Course) |> 
    pairs() |> 
    as_tibble() |> 
    arrange(Course, p.value) |> 
    kable()
contrast Course estimate SE df t.ratio p.value
pharmacy - physiotherapy first 3.8455140 1.303579 708 2.9499653 0.0172596
nursing - physiotherapy first 3.2091503 1.221294 708 2.6276632 0.0434728
medicine - pharmacy first -2.6253746 1.431167 708 -1.8344294 0.2580032
medicine - nursing first -1.9890110 1.356643 708 -1.4661274 0.4586390
medicine - physiotherapy first 1.2201393 1.171822 708 1.0412327 0.7252300
nursing - pharmacy first -0.6363636 1.471948 708 -0.4323274 0.9729063
nursing - physiotherapy last 11.5513614 1.261740 708 9.1551032 0.0000000
pharmacy - physiotherapy last 13.8169864 1.414244 708 9.7698729 0.0000000
medicine - pharmacy last -7.3877404 1.558767 708 -4.7394759 0.0000154
medicine - physiotherapy last 6.4292460 1.407569 708 4.5676230 0.0000344
medicine - nursing last -5.1221154 1.421850 708 -3.6024304 0.0019108
nursing - pharmacy last -2.2656250 1.428458 708 -1.5860633 0.3872213
emmeans(anova, ~ Degree | Course) |> 
    pairs() |> 
    confint() |> 
    as_tibble() |> 
    filter(Course == "first") |> 
    mutate(contrast = reorder(contrast, estimate)) |> 
    ggplot(aes(x = estimate, y = contrast)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference of HC_PAIRS means in the first course",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

In the first course there are significant differences between pharmacy and physiotherapy and between nursing and physioterapy.

emmeans(anova, ~ Degree | Course) |> 
    pairs() |> 
    confint() |> 
    as_tibble() |> 
    filter(Course == "last") |> 
    mutate(contrast = reorder(contrast, estimate)) |> 
    ggplot(aes(x = estimate, y = contrast)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference of HC_PAIRS means in the last course",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

In the last course, there are significant differences between pharmacy and physiotherapy, nursing and physiotherapy, medicine and pharmacy, medicine and physiotherapy and nursing and pharmacy.

In the same way, we compare the courses for each degree.

emmeans(anova, ~ Course | Degree) |> 
    pairs() |> 
    as_tibble() |> 
    arrange(p.value) |> 
    kable()
contrast Degree estimate SE df t.ratio p.value
first - last physiotherapy 8.404711 1.134863 708 7.4059234 0.0000000
first - last medicine 3.195604 1.437534 708 2.2229770 0.0265324
first - last pharmacy -1.566761 1.552898 708 -1.0089276 0.3133540
first - last nursing 0.062500 1.340013 708 0.0466414 0.9628122
emmeans(anova, ~ Course | Degree) |> 
    pairs() |> 
    confint() |> 
    as_tibble() |> 
    mutate(Degree = reorder(Degree, estimate)) |> 
    ggplot(aes(x = estimate, y = Degree)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference between the first and last courses means by degrees",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

There are significant differences between the first and the last courses in pysiotherapy and medicien, but not in nursing and pharmacy.